@skyw skyw commented Oct 31, 2025

A couple of changes (subject to further change before this leaves Draft):

  • The update is not skipped on the first step out of Adam warmup
  • The current gradient is used in preconditioning
  • The first Kronecker factor is updated with the Shampoo beta
  • 0-based and 1-based step counts are explicitly distinguished, since they serve different purposes
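
To make the last two bullets concrete, here is a minimal sketch, not the code in this PR. It assumes the usual conventions: Adam-style bias correction wants a 1-based step (so `1 - beta**step` is never zero), periodic scheduling wants a 0-based step (so step 0 qualifies), and Kronecker factors are accumulated as an exponential moving average of `G @ G^T` and `G^T @ G` weighted by the Shampoo beta. All function names are hypothetical and pure Python lists stand in for tensors.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def bias_correction(beta: float, step_1based: int) -> float:
    """Adam-style bias correction; a 1-based step keeps the result nonzero."""
    if step_1based < 1:
        raise ValueError("bias correction expects a 1-based step (>= 1)")
    return 1.0 - beta ** step_1based


def is_refresh_step(step_0based: int, frequency: int) -> bool:
    """Periodic scheduling; with a 0-based step the very first step qualifies."""
    return step_0based % frequency == 0


def matmul_t(a: Matrix, b: Matrix) -> Matrix:
    """Return a @ b^T for small dense matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b] for row_a in a]


def transpose(a: Matrix) -> Matrix:
    """Return the transpose of a nested-list matrix."""
    return [list(col) for col in zip(*a)]


def ema_update(factor: Matrix, outer: Matrix, beta: float) -> Matrix:
    """Elementwise beta * factor + (1 - beta) * outer."""
    return [
        [beta * f + (1.0 - beta) * o for f, o in zip(f_row, o_row)]
        for f_row, o_row in zip(factor, outer)
    ]


def update_kronecker_factors(
    left: Matrix, right: Matrix, grad: Matrix, shampoo_beta: float
) -> Tuple[Matrix, Matrix]:
    """EMA-update both factors from the current gradient (assumed convention)."""
    left_outer = matmul_t(grad, grad)        # G @ G^T, shape (m, m)
    grad_t = transpose(grad)
    right_outer = matmul_t(grad_t, grad_t)   # G^T @ G, shape (n, n)
    return (
        ema_update(left, left_outer, shampoo_beta),
        ema_update(right, right_outer, shampoo_beta),
    )
```

Keeping the two step conventions in separately named arguments (`step_0based`, `step_1based`) is one way to avoid off-by-one mistakes when the same counter feeds both scheduling and bias correction.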

Also pulled in https://github.com/nikhilvyas/SOAP as a testing reference. The code has been reshuffled to match the updated algorithmic choices. A version that matches the (almost) unmodified reference is archived in the branch https://github.com/NVIDIA-NeMo/Emerging-Optimizers/tree/skyw/soap_vs_reference_backup_donot_merge

@skyw skyw requested a review from mkhona-nvidia October 31, 2025 21:41
copy-pr-bot bot commented Oct 31, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

skyw commented Oct 31, 2025

/ok to test 2b05036

@mkhona-nvidia mkhona-nvidia left a comment

soap.py LGTM

skyw commented Nov 4, 2025

/ok to test e302bf7

@mkhona-nvidia mkhona-nvidia left a comment

lgtm

skyw commented Nov 5, 2025

/ok to test 53cb513

@skyw skyw marked this pull request as ready for review November 5, 2025 17:54
@skyw skyw enabled auto-merge (squash) November 5, 2025 17:54
@skyw skyw merged commit 16f8399 into main Nov 5, 2025
21 of 23 checks passed
@skyw skyw deleted the skyw/soap_further_cleanup branch November 5, 2025 18:33
3 participants